Continual Learning with Scaled Gradient Projection
Authors
Abstract
In neural networks, continual learning results in gradient interference among sequential tasks, leading to catastrophic forgetting of old tasks while learning new ones. This issue is addressed in recent methods by storing the important gradient spaces for the past tasks and updating the model orthogonally to them during new tasks. However, such restrictive orthogonal gradient updates hamper the learning capability of the new tasks, resulting in sub-optimal performance. To improve new learning while minimizing forgetting, in this paper we propose a Scaled Gradient Projection (SGP) method, where we combine orthogonal projections with scaled gradient steps along the important gradient spaces of the past tasks. The degree of gradient scaling along these spaces depends on the importance of the bases spanning them. We propose an efficient method for computing and accumulating the importance of these bases using the singular value decomposition of the input representations for each task. We conduct extensive experiments ranging from continual image classification to reinforcement learning tasks and report better performance with less training overhead than the state-of-the-art approaches.
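As a rough illustration of the idea in the abstract, the sketch below scales (rather than fully removes) the gradient components lying in a stored basis, and derives a simple importance weight from the SVD of input representations. It is not the paper's implementation: the function names (scaled_gradient_projection, basis_importance_from_svd), the energy threshold, and the singular-value-energy importance proxy are assumptions made for this example.

```python
import numpy as np

def scaled_gradient_projection(grad, bases, importance):
    """Hypothetical sketch of a scaled projection update for one layer.

    grad       : (d,) flattened gradient for the current task
    bases      : (d, k) orthonormal basis of the stored gradient space
                 for past tasks (one basis vector per column)
    importance : (k,) values in [0, 1]; 1 fully protects a direction
                 (pure orthogonal projection), 0 leaves it untouched
    """
    # Component of the gradient inside the stored subspace, scaled
    # per basis vector by its accumulated importance.
    coords = bases.T @ grad                    # (k,)
    protected = bases @ (importance * coords)  # (d,)
    # Remove only the "important" fraction of that component.
    return grad - protected

def basis_importance_from_svd(activations, energy=0.99):
    """Toy importance estimate from the SVD of a layer's input
    representations (columns = samples); an assumed proxy, not the
    paper's accumulation rule."""
    U, S, _ = np.linalg.svd(activations, full_matrices=False)
    ratio = (S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(np.cumsum(ratio), energy)) + 1
    importance = ratio[:k] / ratio[:k].max()
    return U[:, :k], importance
```

With importance set to all ones this reduces to the purely orthogonal update the abstract contrasts against; intermediate values let new tasks reuse partially important directions.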
Similar Resources
Scaled Gradient Descent Learning Rate
Adaptive behaviour through machine learning is challenging in many real-world applications such as robotics. This is because learning has to be rapid enough to be performed in real time and to avoid damage to the robot. Models using linear function approximation are interesting in such tasks because they offer rapid learning and have small memory and processing requirements. Adalines are a simp...
Gradient Episodic Memory for Continual Learning
One major obstacle towards artificial intelligence is the poor ability of models to quickly solve new problems, without forgetting previously acquired knowledge. To better understand this issue, we study the problem of learning over a continuum of data, where the model observes, once and one by one, examples concerning a sequence of tasks. First, we propose a set of metrics to evaluate models l...
A Scaled Gradient Projection Method for Constrained Image Deblurring
A class of scaled gradient projection methods for optimization problems with simple constraints is considered. These iterative algorithms can be useful in variational approaches to image deblurring that lead to minimize convex nonlinear functions subject to nonnegativity constraints and, in some cases, to an additional flux conservation constraint. A special gradient projection method is introd...
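As a minimal sketch of the class of methods described above, the snippet below runs one scaled, projected gradient iteration for a problem with nonnegativity constraints. The names (sgp_nonneg, grad_f, scaling) and the fixed step size are assumptions for illustration; the methods in the cited work additionally use step-length selection and, in some cases, a flux-conservation constraint.

```python
import numpy as np

def sgp_nonneg(grad_f, x0, steps=100, alpha=1e-3, scaling=None):
    """Illustrative scaled gradient projection loop for
    min f(x) subject to x >= 0 (names and defaults are assumed)."""
    x = np.maximum(x0, 0.0)
    for _ in range(steps):
        d = grad_f(x)
        if scaling is not None:
            d = scaling(x) * d          # diagonal scaling of the gradient
        # Projection onto the nonnegative orthant is element-wise clipping.
        x = np.maximum(x - alpha * d, 0.0)
    return x
```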
Constrained Stress Majorization Using Diagonally Scaled Gradient Projection
Constrained stress majorization is a promising new technique for integrating application specific layout constraints into forcedirected graph layout. We significantly improve the speed and convergence properties of the constrained stress-majorization technique for graph layout by employing a diagonal scaling of the stress function. Diagonal scaling requires the active-set quadratic programming ...
A scaled gradient projection method for Bayesian learning in dynamical systems
A crucial task in system identification problems is the selection of the most appropriate model class, and is classically addressed resorting to cross-validation or using order selection criteria based on asymptotic arguments. As recently suggested in the literature, this can be addressed in a Bayesian framework, where model complexity is regulated by few hyperparameters, which can be estimated...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i8.26157